Latent Semantic Analysis for Russian Literature Investigation

Author

  • P. I. Nakov
Abstract

The paper presents the results of experiments in using Latent Semantic Analysis to analyze textual data. The method is explained briefly, with special attention to its potential for comparing and investigating Russian literary texts. Two hypotheses are tested:

  • Texts by the same author are alike and can be distinguished from those by a different author.
  • Prose and poetry can be automatically distinguished.
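To make the first hypothesis concrete, here is a minimal sketch of how Latent Semantic Analysis can compare texts. It is not the paper's implementation: the function names `lsa_document_vectors` and `cosine`, the raw word-count weighting, the choice of `k=2`, and the four toy "texts" are all assumptions made for illustration. The sketch builds a term-by-document matrix, truncates its singular value decomposition, and compares documents by cosine similarity in the reduced space.

```python
import numpy as np


def lsa_document_vectors(docs, k=2):
    """Return one k-dimensional LSA vector per document (one row each).

    Assumption: plain whitespace tokenization and raw counts; a real
    experiment would likely use a weighting scheme and a larger k.
    """
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    row = {w: i for i, w in enumerate(vocab)}

    # Term-by-document count matrix: rows are terms, columns are documents.
    X = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            X[row[w], j] += 1

    # Truncated SVD: keep only the k largest singular values.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(k, len(s))
    return (np.diag(s[:k]) @ Vt[:k]).T


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


# Made-up toy texts: the first two stand in for one "author",
# the last two for another. They are not data from the paper.
texts = [
    "sea wind sea",
    "wind sea wind",
    "ball dance ball ball",
    "dance ball dance dance",
]
vectors = lsa_document_vectors(texts, k=2)
print("same author:     ", round(cosine(vectors[0], vectors[1]), 3))
print("different author:", round(cosine(vectors[0], vectors[2]), 3))
```

In this toy setup the two groups share no vocabulary, so texts within a group come out far more similar than texts across groups. Real literary texts overlap heavily in vocabulary, which is where the choice of weighting and of `k` starts to matter.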

Similar resources

Latent Semantic Analysis for German Literature Investigation

The paper presents the results of experiments in using LSA to analyze textual data. The method is explained briefly, with special attention to its potential for comparing and investigating German literary texts. Two hypotheses are tested: 1) texts by the same author are alike and can be distinguished from those by a different author; 2) prose and poetry can be a...

Query expansion based on relevance feedback and latent semantic analysis

Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users alike. Constructing an adequate query that best specifies the user's information need to the search engine is an important concern for web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...

Unsupervised Topic Modeling for Short Texts Using Distributed Representations of Words

We present an unsupervised topic model for short texts that performs soft clustering over distributed representations of words. We model the low-dimensional semantic vector space represented by the dense distributed representations of words using Gaussian mixture models (GMMs) whose components capture the notion of latent topics. While conventional topic modeling schemes such as probabilistic l...

Preliminary Experiments on Literature Based Discovery using the Semantic Vectors Package

This paper presents a literature based discovery (LBD) implementation that uses Lucene for indexing, the Semantic Vectors (SV) package for latent semantic analysis, Neo4j for graph database storage, and Gephi for visual representation, along with custom code written by the author. The approach of using a latent semantic analysis based system like SV to do LBD is not new, but going the next steps of...

Contextual analysis in word-for-word MT

Experiments with word-for-word MT of Russian scientific literature have given results which, except for such limited purposes as indexing, are far from satisfactory. The difficulty is not so much one of word order as of syntactic and semantic ambiguity of individual words. Regardless of the treatment of the problem of inflected forms, for example, it is impossible in the majority of instances t...

Publication date: 2001